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ABSTRACT 

The  homogenized  energy  framework  quantifying  ferroelectric  and  ferromagnetic  hysteresis  is  increasingly  used 
for  comprehensive  material  characterization  and  model-based  control  design.  For  operating  regimes  in  which 
thermal  relaxation  mechanisms  and  stress-dependencies  are  negligible,  existing  algorithms  are  sufficiently  effi¬ 
cient  to  permit  device  optimization  and  the  potential  for  real-time  control  implementation.  In  this  paper,  we 
develop  algorithms  employing  lookup  tables  which  permit  the  high  speed  implementation  of  formulations  which 
incorporate  relaxation  mechanisms  and  electromechanical  coupling.  Aspects  of  the  algorithms  are  illustrated 
through  comparison  with  experimental  data. 


1  Homogenized  Energy  Framework 

Ferroelectric  and  ferromagnetic  materials  are  being  considered  as  transducers  for  an  increasing  number  of  ap¬ 
plications  due  to  their  broadband  capabilities,  large  electromechanical  and  magnetomechanical  coupling  factors, 
and  their  dual  capability  for  actuating  and  sensing.  At  low  drive  levels,  the  direct  and  converse  electromechani¬ 
cal/magnetomechanical  effects  are  approximately  linear,  and  linear  models  and  control  designs  can  be  employed. 
However,  at  the  moderate  to  high  drive  levels  where  the  unique  transducer  capabilities  are  manifested,  the  con¬ 
stitutive  material  properties  are  inherently  nonlinear  and  hysteretic.  Material  characterization  necessitates  the 
development  of  models  which  accurately  characterize  constitutive  nonlinearities  and  hysteresis  whereas  device 
optimization  and  real-time  control  implementation  requires  that  the  models  be  highly  efficient  to  implement. 
While  a  number  of  frameworks  have  been  developed  for  characterizing  ferroelectric  and  ferromagnetic  hysteresis, 
the  competing  requirements  of  accuracy  and  efficiency  limit  which  models  may  be  considered  for  both  material 
characterization  and  real-time  implementation. 

At  the  microscopic  scale,  the  physical  mechanisms  which  produce  hysteresis  in  ferroelectric  and  ferromagnetic 
materials  differ  substantially  since  ferroelectricity  is  due  to  the  ionic  structure  of  materials  and  ferromagnetism 
results  from  interactions  between  magnetic  moments  and  electron  spins.  However,  at  the  domain  or  macroscopic 
scale,  shared  physical  and  energy  properties  permit  the  development  of  unified  frameworks  for  characterizing 
hysteresis  of  compounds  (see  [14]  for  details  regarding  shared  properties  of  ferroic  materials  at  the  various 
scales) . 

In  addition  to  the  dielectric  and  magnetic  hysteresis  exhibited  by  the  compounds,  transducer  models  must 
characterize  the  thermal  relaxation  effects  and  stress-dependencies  exhibited  by  ferroelectric  and  ferromagnetic 
materials.  The  former  phenomenon  is  illustrated  in  Figure  1(a)  with  magnetic  data  collected  from  a  steel  rod  [3]. 
Stress  effects  are  illustrated  in  Figure  1(b)  via  PLZT  data  from  [9]. 

As  detailed  in  [14],  several  frameworks  have  been  developed  to  characterize  ferroelectric  and  ferromagnetic 
hysteresis  for  regimes  in  which  relaxation  mechanisms  and  stress-dependencies  are  negligible.  These  include 
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Figure  1:  (a)  Magnetization  data  from  a  steel  rod  illustrating  the  effect  of  magnetic  after-effects  on  biased  minor 
loops  [3].  (b)  PLZT  data  showing  the  effect  of  stress  on  polarization  [9]. 


domain  wall  models  [7,  10,  17],  Preisach  models  [5,  12,  13,  21],  and  homogenized  energy  models  [11,  15, 16,  18,  20]. 
Whereas  certain  facets  of  relaxation  and  stress-dependence  have  been  incorporated  in  domain  wall  models  and 
Preisach  models  -  e.g.,  [4]  and  [6]  comprehensive  theory  incorporating  these  mechanisms  for  general  compounds 
has  not  been  developed  within  these  two  frameworks.  Hence  we  focus  on  the  homogenized  energy  framework 
which  naturally  incorporates  thermal  relaxation  and  direct  electromechanical/magnetomechanical  effects  due  to 
the  energy  basis  of  the  framework. 

As  detailed  in  [1,  2,  14,  19,  20],  the  general  homogenized  energy  model  for  ferroelectric  materials  is 

/>oo  /*oo 

[P{E1a)](t)=  /  /  [P{E  +  (1) 

J  0  J  —  oo 

where  E  and  a  denote  input  electric  fields  and  stresses,  P  is  the  macroscopic  polarization,  Ec  and  Ej  are  local 
coercive  and  interaction  fields,  and  uc,  vj  are  densities  which  incorporate  the  effects  of  material  nonhomogeneities, 
polycrystallinity,  and  variable  effective  fields  Ee  =  E  +  Ej.  It  is  shown  in  subsequent  sections  that  the  kernel 
P  follows  from  direct  energy  minimization  when  thermal  relaxation  effects  are  negligible  or  from  Boltzmann 
principles  if  relaxation  is  significant. 

The  magnetic  model  is  analogous.  For  magnetic  field  IF,  magnetization  M,  coercive  field  Hc  and  interaction 
field  Hi ,  the  model  is 

POO  POO 

[M(H,a)](t)=  /  +  (2) 

J  0  J  —  oo 

In  both  cases,  it  is  assumed  that  the  densities  vc  and  vj  satisfy  the  criteria 

vc(x)  defined  for  x  >  0, 

vi{-x)  =  vi(x),  (3) 

\vc{x)\  <  Cie~aiX,  \vi(x)\  <  c2e~a2X 

for  positive  ci,  01,02,02-  These  assumptions  enforce  the  physical  properties  that  the  local  coercive  fields  are 
positive,  low-field  Rayleigh  loops  are  symmetric,  and  local  coercive  and  interaction  fields  decay  as  a  function  of 
distance. 

For  implementation  purposes,  either  Gaussian  or  Newton-Cotes  quadrature  routines  are  used  to  approximate 
the  integrals  in  (1)  or  (2)  -  see  Chapter  8  of  [14].  To  illustrate,  let  Ec[i\,i  =  1, . . . ,  Nc,  and  Ei[j],j  =  1, . . . ,  IV/ , 
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denote  the  abscissas  (quadrature  points)  and  wc\i\,  wj[j\  denote  the  respective  quadrature  weights.  Approxima¬ 
tion  of  the  integrals  in  (1)  then  yields  the  discretized  model 

Nc  Nj 

[P(E,  <r)](t)  =  Y  Y^P^E  +  Ei[j],a;  Ec[i])](t)uc(Ec[{\)i>I(EI[j])wc[i]wI[j^ 

*= 1  3= 1 

(4) 

Nc  N; 

=  EE^  +  Ei\j],a;Ec[i])](t)wc[i]wi\j] 

*= 1  3= 1 

where  u>c[i]  =  u;c[i]i'c(£lc[i])  and  w/[j]  =  Wi[j]vi{Ei[j}).  The  discretization  of  the  magnetization  model  is 
similar.  We  note  that  efficient  implementation  algorithms  are  required  when  large  quadrature  limits  Nc  and  Nj 
are  dictated  by  accuracy  requirements. 

The  analogy  between  the  polarization  model  (1)  and  magnetization  model  (2)  illustrates  the  manner  in  which 
the  framework  characterizes  hysteresis  in  the  combined  class  of  materials  -  see  [14,  19].  To  simplify  subsequent 
discussion,  we  focus  primarily  on  the  polarization  models  (1)  and  (4)  and  note  that  analogous  results  follow 
for  the  ferromagnetic  model.  Aspects  of  both  models  are  illustrated  in  examples  and  through  comparison  with 
experimental  data  in  Section  5. 

The  technique  for  constructing  the  kernel  P  can  be  summarized  as  follows.  We  first  construct  Gibbs  energy 
relations  G  comprised  of  a  Helmholtz  energy  ip,  which  quantifies  the  internal  energy  associated  with  a  continuum 
of  dipole  or  moment  configurations,  and  elastic,  electrostatic,  or  magnetostatic  work  relations.  For  regimes  in 
which  thermal  relaxation  is  negligible,  P  is  determined  by  minimizing  G  with  respect  to  P.  Physically,  this 
can  be  interpreted  as  specifying  the  polarization  that  results  when  dipoles  reorient  in  response  to  an  applied 
field.  To  incorporate  thermal  relaxation  mechanisms,  the  Gibbs  and  relative  thermal  energies  are  balanced  using 
Boltzmann  principles.  As  detailed  in  Section  2.6  of  [14],  this  is  equivalent  to  minimizing  a  combined  measure  of 
the  internal  and  relative  thermal  energies. 

In  Section  2  we  summarize  the  construction  of  appropriate  energy  functionals  and  formulation  of  the  kernel 
P.  Whereas  the  model  construction  has  appeared  in  the  literature  -  e.g.,  see  [1,  2,  14,  19]  -  we  reformulate 
various  components  to  facilitate  the  construction  of  highly  efficient  implementation  algorithms  through  the  use 
of  lookup  tables.  These  algorithms  are  presented  in  Section  3,  and  in  Section  4  we  illustrate  that  computation 
times  can  be  reduced  by  up  one  to  four  of  magnitude  when  implementing  formulations  which  incorporate  thermal 
relaxation  and/or  stress-dependencies.  In  Section  5  we  illustrate  the  accuracy  of  the  framework  for  characterizing 
the  data  in  Figure  1. 

2  Energy  Relations  and  Construction  of  the  Kernel  P 

We  summarize  in  Section  2.1  appropriate  Helmholtz  and  Gibbs  energy  relations  for  stress-invariant  regimes 
which  involve  180°  switching  and  stress-dependent  regimes  with  both  180°  and  90°  switching.  In  Section  2.2  we 
detail  the  manner  through  which  the  Gibb’s  energy  is  used  to  construct  kernels  P  in  the  absence  or  presence  of 
thermal  activation  which  produces  relaxation  phenomenon.  Issues  relating  to  the  implementation  of  the  kernels 
are  presented  in  Section  2.3,  and  various  kernel  constructions  are  compared  from  a  computational  perspective  in 
Section  2.4. 

2.1  Helmholtz  and  Gibbs  Energy  Relations 

2.1.1  Stress-Invariant  Regimes:  180°  Switching 

As  detailed  in  [14,  20],  an  appropriate  Helmholtz  energy  relation  for  temperature-invariant,  stress-invariant 
operating  regimes  with  strictly  180°  switching  is 
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(a)  (b) 


Figure  2:  (a)  Helmholtz  energy  for  the  stress-invariant  case,  and  (b)  Helmholtz  energy  for  the  stress-dependent 
case. 


m  = 


77(P  +  Pr)2/2, 

P  <  Pi 

!<p'  -  p*>  (5  -  p«)  ’ 

,  \P\<Pi  , 

(5) 

77(P-Pr)2/2, 

P  >  Pi 

where  Pi  denotes  the  positive  inflection  point,  PR  is  the  local  remanence  value,  and  rj  is  the  reciprocal  slope 
It  is  shown  in  Section  2.2  that  using  this  piecewise  quadratic  approximation  for  the  energy  results  in  a  piecewise 
linear  kernel  where  rj  is  the  reciprocal  slope  of  the  positive  and  negative  kernel  branches.  This  energy  relation  is 
plotted  in  Figure  2(a). 

The  Helmholtz  energy  (5)  quantifies  the  internal  energy  associated  with  positive  and  negative  dipole  orien¬ 
tations.  The  Gibbs  energy  relation 

G(E,  P)  =  4>(P)  —  EP  (6) 


balances  this  internal  energy  with  the  electrostatic  energy  or  work  performed  by  an  applied  external  field.  As 
detailed  [14],  where  the  Legendre  transform  properties  of  G  are  discussed,  care  must  be  taken  to  interpret  E  as 
the  independent  variable  and  P  as  the  dependent  variable. 


2.1.2  Stress-Dependent  Regimes:  180°  and  90°  Switching 


The  Helmholtz  relation  (5)  was  derived  under  the  assumption  that  dipoles  are  aligned  either  in  the  field  direction 
or  diametrically  opposite  to  it  -  this  characterizes  the  internal  energy  associated  with  field-induced  180°  switching. 
As  illustrated  in  Figure  3,  the  application  of  stresses  perpendicular  to  the  poling  direction  additionally  produces 
90°  switching.  It  is  illustrated  in  [1,  2]  that  whereas  the  strict  incorporation  of  90°  switching  requires  a  2-D 
or  3-D  energy  landscape,  reasonable  approximations  can  be  obtained  for  a  number  of  applications  with  1-D 
Helmholtz  relations  whose  minima  correspond  with  180°  and  90°  dipole  positions,  as  shown  in  Figure  2(b).  We 
summarize  this  latter  case. 

Following  the  development  in  [1,  2],  the  Helmholtz  energy  quantifying  the  internal  energy  associated  with 
the  three  dipole  states  is  taken  to  be 


ipp{P)  =  < 


V(P  +  PR)2/2, 
m(P  +  Pm)2/ '2  +  0, 

772P2/2  +  A, 
m{P-Pm)2/2  +  p, 
r,(P  -  PR)2/ 2, 


P  <  -Pi 

-Pi  <  P  <  —P901 
\P\  <  P901 
P901  <  P  <  Pi 
P  <  -Pi 


(7) 
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(a)  (b)  (c) 


Figure  3:  Possible  polarization  states  of  a  single  crystal  including  (a)  the  state  with  0  applied  field,  (b)  the  180° 
switch  which  occurs  with  a  sufficiently  strong  negative  field,  and  (c)  a  90°  switch  induced  by  stress. 

where 

p  _  y{Pi  -  Pr)P9qi  ~  V2P901P1  _  Pi  -  Pr 

m~  y{P!  -  PR)  -  r,2P90I  ’  m  ~  V  Pi  -  Pj 

Pr)2  P-)2’  A  =  f  (p90/  -  Pmf  +P-  |  P9V 

The  parameters  Pgoi  and  772  are  analogous  to  Pi  and  77,  except  they  apply  to  dipoles  that  are  normal  to  the  applied 
field  (i.e. ,  in  a  90°  orientation).  Note  that  due  to  the  one  dimensional  context,  we  consider  all  normal  directions 
as  the  same  state.  This  distinguishes  this  from  a  true  2-D  approach  but  greatly  simplifies  the  computation. 

The  energy  relation  (7)  incorporates  the  internal  energy  associated  with  dipole  switching  but  neglects  electro¬ 
mechanical  coupling  and  elastic  contributions.  Electromechanical  coupling  can  be  quantified  by  the  relation 

ipes{P,  e)  =  -aeP  -  qeP2,  (8) 

where  a  is  the  piezoelectric  coupling  coefficient,  q  is  the  electrostrictive  coupling  coefficient,  and  £  is  the  strain. 
The  elastic  energy  for  uniaxial  regimes  is 

^ei{e)=^Y£2  (9) 

where  Y  is  the  Young’s  modulus  of  the  material.  The  total  Helmholtz  energy  is  thus 

4’(P,e)  =  tpp(P)  +  4>es(P,e)  +  V’ez(e)-  (10) 

The  associated  Gibb’s  energy  is 

G(E,P,<j,e)  =  ip(P1e)-EP~a£  (11) 

where  a  is  the  stress.  As  detailed  in  [1,  2],  enforcement  of  the  quilibrium  condition  ^  =  0  allows  the  strain  to 
be  written  as 

£  =  Y~1(a  +  aP  +  qP2).  (12) 

Substituting  into  (11)  yields  a  Gibb’s  formulation  posed  solely  in  terms  of  the  input  field  and  stress. 

2.2  Relations  for  the  kernel  P 

2.2.1  Stress-Invariant  Regimes:  180°  Switching 

For  regimes  in  which  thermally  activated  relaxation  is  negligable,  direct  minimization  of  (6)  yields  the  kernel 

~P  =  —  +  PrS  (13) 

n 


5 


where  (5=1  for  positively  oriented  dipoles  and  <5  =  — 1  for  negatively  oriented  dipoles.  From  (13),  it  follows  that 
the  local  coercive  field  Ec,  local  remanence  polarization  PR,  and  inflection  point  Pj  are  related  by  the  expression 

Pi  =  Pr~^.  (14) 

V 

Thermal  activation  is  manifested  by  dipoles  having  sufficient  thermal  energy  to  switch  states  before  a  minima 
of  the  Gibbs  energy  is  eliminated.  To  quantify  this,  the  Boltzmann  relation 

p(G)  =  Cexp(-G{E,  P)V/kT)  (15) 


balances  the  Gibbs  and  relative  thermal  energies,  where  V  is  a  reference  volume,  T  is  the  temperature  in  degrees 
Kelvin,  and  k  is  Boltzmann’s  constant.  The  constant  C  is  a  chosen  to  ensure  integration  to  unity.  It  is  shown  in 
[14,  20]  that  the  resulting  kernel  is 

P  =  x+  (P+)  +  (1  -  *+)  (P_>  (16) 

where  x+  denotes  the  fraction  of  positively  oriented  dipoles  and 


(■ P+)  = 


/“Pexp(-M-) 

Jp>  P(^W^)< 


dP 


(P_)  = 


f-x  PexP  ( 


-G(E,P)V 

kT 


dP 


O  p{=^)dp 


(U) 


are  the  average  polarizations  associated  with  positive  and  negative  dipole  orientations.  The  evolution  of  dipole 
fractions  is  governed  by  the  differential  equation 


x+  =  -P+-X+  +P-+(1  -  x+),  (18) 

where  the  likelihoods  p _| _ and  p _ (_  of  a  dipole  switching  from  positive  to  negative,  or  vice-versa,  are 

n,  =  _ ' _ 5 _ 1 _  n,  =  _ : _ 5 _ 1 _  (19) 

T(T>fZe*p(=e!%m)dP’  _+ 

Here  r  is  the  material  and  temperature-dependent  relaxation  time  and  e  is  a  small,  positive  constant  (its  purpose 
is  detailed  in  Section  2.3.1). 

For  implementation  purposes,  a  backward  Euler  discretization  can  provide  sufficient  accuracy  when  solv¬ 
ing  (18)  if  stepsizes  are  reasonably  small.  The  integrals  in  (17)  and  (19)  can  be  solved  via  quadrature  routines. 
However,  this  is  not  necessary  in  this  case,  since  it  is  shown  in  Section  2.3.2  that  they  can  be  formulated  as  a 
combination  of  exponential  and  error  functions.  These  functions  are  provided  by  most  mathematical  libraries 
and  generally  provide  better  performance  than  direct  evaluation  in  both  computation  time  and  accuracy. 

2.2.2  Stress-Dependent  Regimes:  180°  and  90°  Switching 

The  kernel  development  for  the  stress-dependent  case  is  analogous  but  employs  the  Gibbs  relation  (11).  Due  to 
the  definition  (7)  in  the  Helmholtz  energy,  4  switching  points  occur:  from  either  positive  or  negative  to  90°  and 
from  90°  to  either  180°  orientation.  We  let  Ec  denote  the  field  at  which  the  polarization  switches  to  a  positive 
orientation  and  specify  the  transition  to  a  ninety  degree  orientation  through  the  independent  parameter  Pj.  The 
input  electric  field  levels  at  which  the  dipole  switches  to  90°  orientations  occur  at 

£+90  =  -rr(2 q2pf  +  3aqP f  +  ( a 2  -  r]Y  +  2qa)Pi  +  aa  +  pY PR), 

l  (20) 

E- 90  =  —(2 q2Pf  -  3aqPj  +  ( a 2  -  pY  +  2qa)PI  -  aa  -  pYPR). 

Note  that  this  implies  180°  switches  will  always  be  symmetric  whereas  if  linear  electromechanical  coupling  is 
significant,  the  90°  switches  may  not  be  symmetric.  This  was  chosen  to  match  behavior  observed  in  stressed 
ferroelectric  materials. 
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For  regimes  in  which  thermal  activation  is  negligible,  the  roots  of  ^  may  be  obtained  numerically  for  each 
of  the  cases  P  >  Pj,  P  <  — Pi  and  |P|  <  Pgo/-  While  it  is  possible  that  more  than  one  real  root  may  exist,  only 
one  is  typically  within  a  valid  range  for  P. 

To  incorporate  thermal  activation,  we  again  balance  (11)  with  relative  thermal  energy  via  the  Boltzmann 
relation  (15).  This  yields  three  average  polarization  relations 


(P+) 


(21) 


corresponding  to  the  three  dipole  states.  Since  dipoles  have  three  orientations,  we  must  keep  track  of  the  fraction 
oriented  positively  x+  and  negatively  x_ .  The  third  fraction  is  given  by  Xgg  =  1  —  x+  —  .  This  resulting  kernel 

is 


P  =  x+  (P+)  +  X-  (P_)  +  (1  -  x+  -  X-)  (P90) . 


(22) 


The  evolution  of  dipole  fractions  is  simplified  by  assuming  that  a  dipole  may  only  switch  to  an  adjacent  state; 
i.e. ,  a  negative  dipole  must  switch  to  90°,  not  directly  to  positive.  This  yields  the  likelihood  relations 


P+90  = 


P90+  = 


v-)dp 


exp  ( 


-G(E,P)V 

kT 


dP 


f(T)  fp;  l 

( -G(E,P)V' 

K  kT  . 

)  dP  ’ 

P- 90  —  p 

t(T)  f_J  exp  ( 

f  -G{E,P)VS 
K  kT  , 

)  dP 

r^907  „  ( 

JPooi-  eeXP( 

-G(E,P)V\ 

kT  J 

dP 

I-P90,  exP  1 

’-G(E,P)V\ 
.  kT  ) 

dP 

t(t)  j: 

and  the  differential  equation 


P901 

P901  exp 


dP 


P  90—  = 


r_ 


P9  0/  _ 
P90,  ^ 


^-G(EjP)V^ 


dP 


(23) 


X- 

_ 

— P-90  —  P90- 

-P90- 

x_ 

+ 

P90— 

x+ 

P90+ 

P+90  —  P90+ 

.  x+  . 

P90+ 

governing  the  evolution  of  dipole  fractions.  Again,  a  backward  Euler  method  typically  provides  sufficient  accuracy 
while  minimizing  computation.  We  note  that  unlike  the  stress-free  case,  the  integrals  in  (21)  and  (23)  cannot  be 
reduced  to  error  functions  and  exponentials  and  hence  must  be  approximated  with  a  quadrature  method. 


2.3  Theoretical  and  Practical  Implementation  Details 

We  summarize  here  issues  pertaining  to  the  construction  of  components  arising  in  the  kernel  definitions  and 
implementations. 

2.3.1  Construction  of  Likelihood  Relations 

The  relations  (19)  and  (23)  quantify  the  likelihood  that  dipoles  achieve  the  relative  thermal  energy  required 
to  reorient  in  advance  of  fields  required  to  eliminate  minima  of  the  Gibbs  energy.  As  detailed  in  Section  2.6 
of  [14],  this  is  initially  formulated  as  a  sum  of  the  discrete  set  of  dipole  states  with  the  continuum  assumption 
and  subsequent  integration  performed  to  facilitate  evaluation  of  the  likelihoods.  For  the  discrete  set  of  states, 
the  likelihoods  are  computed  by  point  evaluation  of  G  at  Pj.  Under  the  assumption  of  a  continuum,  however, 
point  evaluation  yields  probabilities  and  hence  likelihoods  of  0.  Thus,  care  must  be  taken  when  formulating 
appropriate  likelihood  relations. 

One  approach  is  to  employ  the  formulations  (19)  and  (23)  where  e  is  chosen  to  be  a  small  positive  constant. 
The  resulting  likelihood  relations  are  well-posed  but  have  the  disadvantage  of  requiring  an  additional,  nonphysical 
parameter  e. 
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Alternatively,  one  can  approximate  the  integrals  in  (19)  and  (23)  with  Riemann  sums  and  incorporate  e  into 
the  relaxation  time  r  to  obtain  the  expressions 

exp  (-ajar)  eXp(^ar) 

t(T)  /”  exp  ( dP '  P’+  r(r)/_-;'exp(^M)dp' 

This  avoid  the  uncertainty  in  the  choice  of  e  and  is  advantageous  in  2-D  and  3-D  models  where  the  integrals 
become  surface  or  volume  integrals.  For  stress-invariant  one-dimensional  models,  both  likelihood  relations  may 
be  expressed  in  terms  of  the  complementary  error  function.  While  either  approach  is  reasonable  and  yields 
similar  results,  we  employ  the  former  approach  in  subsequent  algorithms. 

Either  approach  for  determining  likelihood  may  still  yield  the  nonphysical  situation  that  a  dipole  has  not 
switched  even  after  its  input  Ee  has  passed  the  coercive  point.  The  likelihood  calculation  is  intended  to  allow 
dipoles  to  switch  ahead  of  this  point.  However,  for  some  parameter  values  it  may  dictate  that  dipoles  attempt  to 
switch  infrequently,  possibly  not  until  after  the  field  has  passed  the  coercive  point.  To  remove  this  possibility,  we 

employ  (19)  only  when  Ee  <  Ec  (for  p _ |_)  or  Ee  >  —Ec  (for  p. ( _ ).  For  other  values,  the  likelihood  is  defined  to 

be  Analogous  definitions  are  employed  for  (23).  This  also  has  a  computational  advantage,  as  (19)  or  (23) 
may  overflow  the  floating  point  number  system  for  values  of  Ee  well  beyond  the  coercive  point.  This  definition 
removes  these  floating  point  overflows  from  consideration. 


2.3.2  Conversion  of  Integrals  to  Exponentials  and  Error  Functions 

In  the  stress-invariant  thermal  activation  case,  the  integrals  in  (17)  and  (19)  can  be  expressed  in  terms  of 
exponential  and  complementary  error  functions  to  simplify  their  computation.  For  brevity,  this  will  only  be 
derived  for  the  positive  likelihood  and  average  polarization.  The  computation  of  negative  likelihood  and  average 
polarization  relations  is  analogous.  Consider  the  integral 


This  can  be  rewritten  as 


a  /  exp 


=  a 


pV 

2kT 


P-  Pr 


E 


dP 


L 


{y-PR-Ee/nWvV/2kT 


(; x-PR-EJri)^riV/2kT 


exp  (— t2)  dt 


—  (X\ 


1 2kT 

pV 


E\  /  rjV 


erfc  [x  -  PR - I  \l  — —  -erfc  (y  -  PR - \  — — 


77  )  V  2  kT 


E 


r/V 


77  y  V  2 kT 


where  a  =  exp (Pj|  —  (Pr  +  P)2)-  Applying  this  to  the  likelihood  definition  and  substituting  for  Pj  yields 


To  specify  the  average  polarization,  we  note  that 


Pexp 


'Pi 


dP  =  a 


PjdP 


akT 

~yV 


r-yv 
JPl  kT 

8 


P-Pr 


jdP  +  a 


qcLP 


where  7  =  exp(—  (-P  —  Pr  —  ^)2).  The  first  integrand  is  now  the  derivative  of  7  which  yields 


2kT  J 


((■ 


erfc  -  A 

11  Tj  T) 


—  +  Pr- 
V 


(27) 


We  note  that  this  re-formulation  is  not  possible  in  the  stress-dependent  case  due  to  the  additional  terms  in  (11). 

2.3.3  Quadrature  Methods  and  Stress-Dependence 

In  addition  to  requiring  quadrature  methods  to  evaluate  (21)  and  (23),  stress-dependence  also  necessitates  care 
when  choosing  the  quadrature  method  used  to  approximate  the  coercive  field  integral  in  (4).  Note  that  vc(x) 
is  defined  only  for  x  >  0.  Some  quadrature  methods  for  this  distribution  -  such  as  a  trapezoid  or  Simpson’s 
rule  -  will  include  the  zero  endpoint.  As  long  as  ^c(0)  =  0,  this  poses  no  issue.  However,  if  ;/c(0)  7^  0,  the 
stress-invariant  formulations  will  still  yield  reasonable  results.  In  the  stress-dependent  formulation,  vc(Q)  7^  0 
gives  dipoles  whose  coercive  point  from  90°  to  positive  is  the  same  as  from  90°  to  negative,  i.e.,  dipoles  try  to 
switch  both  directions  at  once.  This  yields  nonphysical  and  numerically  ambiguous  results.  For  this  reason, 
implementation  of  the  stress-dependent  formulation  must  avoid  vc  quadrature  points  at  0  either  by  enforcing 
vc{ 0)  =  0  or  by  employing  a  quadrature  method  that  does  not  use  the  left  endpoint.  Gaussian  quadrature  is  a 
natural  choice  for  this  reason  as  well  as  for  accuracy. 

2.3.4  Avoidance  of  Overflow 

A  final  note  regards  the  calculation  of  (21)  and  (23).  Whereas  the  equations  themselves  are  well-defined,  the 
numerators  and  denominators  may  grow  quite  large  or  small,  leading  to  either  a  numerical  overflow  or  under¬ 
flow/division  by  zero.  This  occurs  more  frequently  for  large  V/kT.  As  discussed  in  [14],  the  activation  behavior 
approaches  the  negligible  relaxation  model  as  V/kT  — >  00.  Thus,  overflow  is  dealt  with  by  substituting  the 
limit  -  the  same  energy  minimization  performed  in  the  negligible  activation  case  -  whenever  it  occurs. 

2.4  Numerical  Cost  of  Kernels 

The  run-times  for  algorithms  directly  implementing  the  kernels  P  specified  in  (13),  (16)  and  (22)  are  summarized 
in  Table  1.  We  note  that  inclusion  of  stress-dependence  results  in  an  algorithm  that  runs  approximately  25  times 
slower  whereas  the  inclusion  of  thermal  relaxation  requires  at  least  90  times  more  computational  effort.  In  the 
worst  case,  inclusion  of  both  relaxation  and  stress-dependence  is  over  28,000  times  slower  than  assuming  both 
are  negligible. 

It  should  be  emphasized  that  the  run-time  depends  greatly  on  several  implementation  criteria.  For  example, 
in  the  stress-dependent  negligible  relaxation  case,  the  roots  of  a  cubic  polynomial  must  be  computed.  These  may 
be  computed  through  equations  for  cubic  polynomials  or  through  eigenvalue  calculation  (the  MATLAB  roots 
command  employs  the  latter  approach).  The  choice  of  method  will  impact  the  run-time  but  does  not  change  the 
qualitative  difference  in  computation  time  between  formulations. 


Implementation  Algorithm 
stress-invariant,  negligible  relaxation 
stress-invariant,  relaxation 
stress-dependent,  negligible  relaxation 
stress-dependent,  relaxation 


Run-time  (s) 
2.758  x  101 
2.448  x  103 
6.751  x  102 
7.880  x  105 


Table  1:  Run-times  employing  a  direct  implementation  the  kernels  P  specified  in  (13),  (16)  and  (22).  Run-times 
are  for  120,000  temporal  iterations  and  80  quadrature  points  for  each  of  the  integrals  in  (4). 
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3  Look-up  Tables  for  Faster  Computation 

It  has  been  demonstrated  that  inclusion  of  stress-dependencies  or  thermal  relaxation  mechanisms  increases  the 
computation  effort  (defined  as  floating  point  operations)  by  over  an  order  of  magnitude  in  all  cases.  For  many 
control  systems,  this  will  preclude  the  possibility  of  real-time  implementation.  To  improve  efficiency,  the  model 
can  be  approximated  to  within  an  arbitrary  degree  of  accuracy  by  employing  look-up  tables.  This  effectively 
trades  memory  for  computation  time.  The  necessary  lookup  tables  will  be  derived  for  thermal  relaxation,  stress- 
dependence,  and  combined  relaxation  and  stress-dependent  formulations. 


3.1  Algorithm  for  Negligible  Relaxation,  Stress-Invariant  Regimes 

Regimes  with  negligible  relaxation  and  stresses  do  not  require  lookup  tables.  However,  the  computation  time 
may  be  halved  through  algorithm  optimizations.  Note  that  the  discretized  model  may  be  written  as 


=  £  f>[» ]»'[*](-  +  —  +  W[i. il) 

p  Nc  N,  Nc  N, 

=  -  E  E  +  E  E  wc\i\WI[j\  — T— 

'  i=  1  j  =  1  2=1  j= 1  ' 


Nc  iVj 


EE  Wc[i]wi[j]PR8[i,j]. 

(=i  j= l 


(28) 


In  this  form,  the  first  term  is  a  constant  times  the  input  field,  whereas  the  second  is  constant.  Only  the  third 
term  will  depend  on  each  quadrature  point.  Premultiplying  either  wc  or  wi  by  Pr  further  reduces  the  number  of 
computations  per  time  step.  These  changes  are  reflected  in  Algorithm  1.  Similar  optimizations  were  performed 
for  each  of  the  following  formulations. 


3.2  Algorithm  for  Thermal  Relaxation,  Stress-Invariant  Regimes 

As  shown  in  Section  2.4,  the  majority  of  the  computational  effort  for  the  stress-invariant  relaxation  model  is 
spent  in  the  computation  of  the  average  polarization  and  likelihood  relations  (17)  and  (19).  If  it  were  possible  to 
calculate  and  store  these  values  in  advance,  the  per  iteration  time  would  decrease  significantly.  The  input  field 
E  is  bounded  by  physical  constraints  whereas  the  quadrature  points  must  also  be  bounded  due  to  the  decay  of 
vc  and  vi  as  posited  in  (3).  Thus,  the  values  associated  with  each  exp  and  erfc  evaluation  are  bounded  and  this 
bound  can  be  determined  a  priori.  This  defines  the  lookup  table  approach;  before  running  the  algorithm,  the 
range  between  each  input  is  discretized,  and  the  values  of  (17)  and  (19)  are  computed  and  stored  for  every  grid 
element.  With  even  a  moderate  number  of  time  steps,  this  reduces  the  overall  computational  effort;  however,  the 


#  Initial  Setup  -  to  be  done  in  advance 

addit  =  ESo"1  M 

«w  =  £tco-1  h 

For  (j  =  0 . . .  Ni  —  1),  wi\j\  =  wi[j]Pn  End  For 

#  Begin  Iteration 

For  k  =  0  . . .  length(E)  —  1 
P[k]  =  addit  +  WsumE 
For  j  =  0  . . .  Ni  —  1 
out  =  0 

Ee  =  E  +  Ei  [j] 

For  i  =  0  ...  Nc  —  1 


If  Ee  +  Ec[i\6[i,j]  >0 
S[i,j]  =  1; 

out  =  out  +  Wc  [j] 
Else 

<5[*,j]  =  -1 

out  =  out  —  wc  [ i ] 
End  If 
End  For 

P[k]  =  P[fc]  +  wi[j\  out 
End  For 
End  For 


Algorithm  1:  Implementation  algorithm  for  regimes  with  negligible  relaxation  and  no  applied  stress. 
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main  purpose  is  to  replace  computationally  intense  real-time  components  of  the  algorithm  by  offline  computations 
and  efficient  memory  usage. 

It  may  appear  that  the  resulting  grid  is  three  dimensional  with  one  dimension  for  each  of  the  input  field  E, 
interaction  field  Ej  and  coercive  field  Ec.  However,  E  and  Ej  are  related  by  Ee  =  E  +  Ej  and  as  such  will 
always  share  a  single  dimension  in  the  grid.  Further,  these  variables  always  occur  as  the  combination 

Etmp-  =  E  +  Ej  —  Ec 

in  the  relations  for  (P-)  and  p _ |_  whereas  for  (P+)  and  p _| _ the  variables  occur  as 

Etmp+  =  E  +  Ej  +  Ec. 

We  therefore  compute  the  bounds  of  Etmp-  and  Etmp+,  discretize  these  variables,  and  calculate  four  1-D  tables 

of  values,  one  for  each  of  (P_),  p _ |_,  (P+)  and  p. ( _ Further  memory  can  be  saved  by  noting  that  the  likelihoods 

need  not  be  calculated  for  values  of  Etmp-  >  0  or  Etmp+  <  0  as  they  are  defined  to  be  constant  in  these  cases 
(see  Section  2.3).  Pseudocode  for  this  formulation  is  given  in  Algorithm  2. 

It  should  be  noted  that  the  likelihoods  and  average  polarizations  need  not  use  the  same  discretization  of 
Etmp-  and  EtmP+',  in  fact,  the  best  accuracy  for  a  given  memory  usage  can  be  obtained  by  storing  more  points 
for  the  likelihoods  than  for  the  average  polarizations.  This  is  due  to  the  likelihoods  being  exponential  functions 
whereas  the  average  polarizations  are  nearly  linear. 

Memory  usage  and  algorithmic  accuracy,  defined  as  the  difference  between  the  lookup  table  approximation 
and  a  direct  model  implementation,  are  linearly  related.  However,  the  number  of  computations  performed  by 
the  real-time  portion  algorithm  is  independent  of  the  lookup  table  size;  the  difference  is  completely  contained  in 
the  offline  setup. 

3.3  Algorithm  for  Stress-Dependent,  Negligible  Relaxation  Regimes 

When  stress-dependence  is  considered  in  the  absence  of  relaxation,  there  are  no  likelihoods  to  compute.  However, 
minimizing  (11)  requires  computing  zeros  of  a  cubic  polynomial.  Given  the  piecewise  definition  of  tpp,  this 


#  Initial  Setup  -  to  be  done  in  advance 
Determine  ranges  for  each  average  and  likelihood 

and  calculate  look-up  tables 

#  Note:  fi  =  \j2kT /r/V 

#  (P-)  =  -{/3/Vn)exp((-EtmP-/vl3).2)/erfc(EtmP-/ril3) 

#  (P+)  =  {P/y/^)exp{{-EtmV+/riP).2)leric(Etmp+/riP) 

#  P-+  =  (At/ t)(1  —  erfc ((EtmP-/eta  +  e)/beta) 

#  / er f  c{Etmp-  / etafi)  for  Ee  <  Ec 

#  p+-  =  (Af/r)(l  -  eric((-EtmP+/eta  +  e)/beta) 

#  /erf c{-EtmV+/ etafi)  for  Ee  >  -Ec 

ff  Note:  we  assume  a  single  stepsize  and  Ee  range  for 

#  brevity.  This  is  not  the  best  choice  for  accuracy,  but  the 

#  extension  is  straightforward. 

For  (i  =  0  . . .  Nc  —  1),  Ec[i]  =  Ec[i]/Eatep  End  For 
For  (j  =  0  .  . .  Ni  -  1),  Ei\j]  =  Ei[j]/E„tep  End  For 

addit  =  Ei^o”1  Ey^o"1  ]£/[?]  /r]  -  PRT]W3um 

Pr  =  2  Pr 


//  Begin  Iteration 
For  k  =  0. . .  length(E)  —  1 
P[k]  =  E[k]w3um  +  addit 

base  —  (-£/[&]  Eef  fmin) / Estep 

For  /  —  ()...  Ni  —  1 
out  =  0 

base  2  =  base  +  Ei[j ] 

For  i  =  0  . . .  Nc  —  1 
h-  =  round(&ase2  —  Ec[i]) 
h+  =  round(foase2  +  Ec[i]) 

x+  [*j  j]  =  (x+[i,j}+p-+[h-})/(l+p-+[h-}+p+-[h+]) 
out  =  out  +  wc[i](x+[i,  j\(Ppos[h+]  -  pneg[h-]  +  Pr) 

pPneg[h  —  ]) 

End  For 

P[k]  =  P[fc]  +  wi\j\  out 
End  For 
End  For 


Algorithm  2:  Implementation  algorithm  for  the  model  with  thermal  relaxation  but  negligible  stress-dependence. 
For  brevity,  all  step  sizes  and  value  ranges  are  treated  as  equal. 
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effectively  yields  three  different  polynomials  to  minimize  corresponding  to  the  three  possible  dipole  orientations. 
We  perform  the  minimization  corresponding  to  each  state  in  advance  and  store  it  in  a  lookup  table.  In  this  case 
there  are  two  independent  variables  in  (11)  that  cannot  be  combined:  the  input  field  Ee  and  input  stress  u. 
Thus,  this  formulation  requires  three  2-D  tables. 

As  in  previous  algorithms,  memory  can  be  conserved  by  noting  that  some  values  need  not  be  computed.  For 
values  of  Ee  <  —EC[NC],  the  minima  corresponding  to  positive  orientations  need  not  be  computed  because  the 
dipole  will  have  always  switched  to  a  different  orientation.  Likewise  negative  orientations  need  not  be  computed 
for  Ee  >  EC[NC],  and  the  ninety  degree  orientation  does  not  need  computed  for  values  at  either  extreme.  This 
reduces  the  memory  requirement  of  the  table.  Implementation  of  this  approach  is  detailed  in  Algorithm  3. 

Computation  in  the  real-time  portion  of  the  algorithm  is  independent  of  lookup  table  size,  as  in  Algorithm  2. 
However,  this  algorithm  also  possesses  a  way  to  trade  a  small  amount  of  computation  time  for  increased  accuracy, 
while  holding  the  table  size  constant.  The  second  derivative  of  the  energy  is  small  -  i.e. ,  the  minimized  energy 
corresponding  to  each  dipole  state  is  nearly  linear.  Much  better  accuracy  can  be  obtained  by  linearly  interpolating 
between  the  table  values.  Since  these  values  occupy  a  2-D  grid,  this  is  efficiently  approximated  by  interpolating 


#  Initial  Setup  -  to  be  done  in  advance 

Calculate  the  minimum  Gibb’s  energy  tables  at  the  given 

Ee  and  a  values  for  each  possible  orientation. 

Call  these  Ppos,  P90,  and  Pneg 
c90posconst  =  Pi(2q2P2  —  3aqPi  +  a2  —  r/Y)/Y  +  t)Pr 
c90poscoeff  =  (2  qPj  —  a)/ Y 

c90posconst  =  —Pi(2q2P?  +  3  aqPi  +  a2  —  rjY)/Y  —  t/Pr 

c90poscoeff  =  —(2  qPj  +  a)/ Y 

For  (3  =  0  ...  Ni  -  1),  Ei\j]  =  Ei\j]/ Estep  End  For 

#  Begin  Iteration 

For  k  =  0  . . .  length(A)  —  1  #  length(A)  =  length(a) 
c90pos  =  c90poscoef  fo[k]  +  c90posconst 
c90neg  =  c90negcoef  fa[k]  +  c90negconst 
P[k ]  =  0 

o-iTld  =  round((<r[fc]  -  amirl)/astep) 

Epostmp  —  (E[k\  - 

Eposmin')  /  Estep 

Eggtmp  =  (E[k]  —  E  ninety  min')  /  Estep 
Efiegtmp  =  (E[k\  -  Enegmin  )  /  EsteP 

For  j  =  0  . . .  Ni  —  1 
out  =  0 

Ee  =  E[k\  +  Er[j] 

Eposind  —  round  (Epost 

mp  +  Ei[j}) 

Egoind  —  YO\Hld(Eninetytmp  "b 
E  negind  —  rOlUld  (E-negtmptmp  +  Ei[j}) 

For  i  =  0  . . .  Nc  —  1 
If  S[i,j]  ==  1  #  positive 
li  Ee  <=  c90neg  and  Ee  >  —Ec[i\ 

5[i,j]  =  0 

out  =  out  +  Wc[i]P9o[Eg0ind 
Else  If  Ee  <  -Ec[i] 


=  -1 

OUt  —  OUt  Wc\i\Pn.eg\Enegind>  (?ind\ 

Else 

Out  —  Out  “1“  Wc\i\Ppos\Eposindi  0~ind\ 

End  If 

Else  If  5[i,j]  ==  0  #  90  degree 
If  Ee  >  Ec[i } 

8[iJ]  =  1 

Ollt  —  OUt  ~b  W c\^\Epos\EpOSind 

Else  If  Ee  <  —Ec[i] 

1  =  -1 

OUt  —  OUt  Wc[i\Pneg\Enegind^  O'ind] 

Else 

out  =  out  +  Wc[i]P9o[Eaoind,  Oind\ 
End  If 

Else  #  90  negative 
If  Ee  >=  c90pos  and  Ee  <  Ec[i] 

5[i,j]  =  0 

out  =  out  +  Wc[i]P9o{E90ind,  &ind] 
Else  If  Ee  >  Ee[i\ 

S[i,j]  =  1 

OUt  —  OUt  -(-  Wc[i\Ppos[Eposindi  &ind\ 

Else 

OUt  —  OUt  “1“  We  [f]  Pneg  \Enegind ,  0~ind\ 

End  If 
End  If 
End  For 

P[k]  =  P[k ]  +  u>i[j]  out 
End  For 
End  For 


Algorithm  3:  Implementation  algorithm  for  model  with  stress-dependence  but  negligible  relaxation. 
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along  one  dimension  and  then  the  other.  Allowing  E\  and  to  represent  the  closest  table  value  below  Ee  and  cr, 
and  E2,  cr 2  to  be  the  values  just  above  the  desired  value,  this  can  be  written  in  one  equation  as 


(P)  =Pn  1  - 


EK  —  Ei 


E, 


+  P21  (  1  - 


step 


EK  —  Ei 


1  - 


a  —  a  1 


E 


step 


® step 

a  —  G\ 
- b 

G step 


P 


E e.  —  Ei 


12  ~ 


E, 


step 


1  - 


a  —  <j\ 
O step 


22" 


(Ee  -  E1)(a  -  ai) 

P step®  step 


(29) 


Here  ( P )  =  (P_) ,  (Pg0)  or  (P+),  values  from  the  table  are  given  by  Pn  =  (P)  (P^oy),  Pi2  =  (P)  (E1,a2), 
etc.,  Estep  is  the  distance  between  two  field  values  in  the  table,  and  astep  the  distance  between  stress  values.  In 
the  numerical  tests  detailed  in  Section  4,  this  was  found  to  increase  the  accuracy  by  2  to  3  orders  of  magnitude 
without  significantly  altering  the  computation  time. 


3.4  Algorithm  for  Thermal  Relaxation  and  Stress-Dependent  Regimes 

The  use  of  a  look-up  table  becomes  less  effective  when  relaxation  and  stress-dependent  effects  are  both  included. 
Here,  the  average  polarizations  and  switching  likelihoods  depend  on  Ee ,  Ec ,  and  a.  Unlike  the  negligible  stress 
case,  however,  the  inclusion  of  addition  terms  in  (11)  no  longer  allows  Ee  and  Ec  to  be  combined  into  a  single 
variable.  This  necessitates  a  3-D  lookup  table  for  each  of  the  average  polarizations  and  stress  values,  yielding  a 
total  of  6  tables.  The  Ec  dimension  can  be  accommodated  exactly  by  using  one  value  at  each  of  the  quadrature 
points  for  the  vc  distribution,  so  that  error  is  only  introduced  on  Ee  and  cr.  However,  this  will  likely  require  25- 
100  times  more  memory  than  the  stress-only  variant.  Further,  while  it  does  not  effect  the  real-time  portion  of  the 
algorithm,  calculation  of  the  look-up  table  can  be  computationally  expensive,  which  effects  parameter  estimation 
techniques.  Nevertheless,  the  look-up  table  does  provide  three  to  four  orders  of  magnitude  improvement  over  a 
direct  implementation  in  the  real-time  portion  of  the  computations.  Implementation  for  this  case  is  summarized 
in  Algorithm  4. 


#  Initial  Setup  -  to  be  done  in  advance 
Determine  ranges  for  each  average  and  likelihood 

and  calculate  look-up  tables  via  quadrature 

#  Note:  we  assume  a  single  stepsize  and  Ee  range  for 

#  brevity.  This  is  not  the  best  choice  for  accuracy,  but  the 

#  extension  is  straightforward. 

For  (j  =  0  ...  Ni  -  1),  P/[j]  =  Ej[j)IEsteV  End  For 

#  Begin  Iteration 

For  k  =  0  . . .  length(P)  —  1 
P[k]  =  0 

OjVQtmp  —  (P[^]  Eef  f  min')  /  Estep 

Oi  nd  =  round(<r[fc]  —  O'min  )/ O' step 
bCLSS  —  (F/[fc]  Eef  frnin)  /  Estep 

For  j  =  0  . . .  Ni  —  1 
Eind  =  round(6ase  +  Ei\j\) 


out  =  0 

For  *  =  0 . . .  Nc  —  1 

a+  =  1  +  P90+[Eind,  Oind,  *]  +  P+90  [Eind  1  O' ind-, 

OL—  =  1  P90—[Eindi  Oindi  P—9o[Eindi  Oindi  i\ 

x+[i,j]  =  («-£+[*,  j]  +  P90+{Eind,  Oind,i](a- 
-X- [i,  j ]  -  pqo—  [Eind,  Oind,  *]))/ (a+a_ 

P90+  [ Eind ,  Oind,  ^]p90—  [Eind,  Oind,  I]  ) 

=  {X-[i,  j}+P90n[Eind,Oind,i\{l-X+[i,  j])) / Ot- 

OUt  =  out  +  Wc[i](x-[i,  j](Pneg[Eind,Oind,i\ 

P9O  [Eind,  Oind,  I])  “t“  X+  [i,  j)  (Ppos  [Eind,  Oind ,  t[ 
P9O  [Eind ,  Oind,  t[)  T  P9O  [Eind ,  (J ind,  t[  ) 

End  For 

P[k]  =  P[k ]  +  wi[j]  out 
End  For 
End  For 


Algorithm  4:  Implementation  algorithm  for  the  model  with  relaxation  and  stress-dependence.  All  step  sizes  and 
value  ranges  in  the  look-up  table  are  treated  as  equal  for  brevity. 
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(a)  (b) 


Figure  4:  Inputs  used  for  the  direct  implementation  of  the  model  and  Algorithms  1-4  based  on  lookup  tables, 
(a)  Input  electric  field  -  this  pattern  repeated  20  times,  and  (b)  Input  stress,  where  applicable 


4  Lookup  Table  Approximation  Run-times  and  Error 

Compared  to  the  direct  “exact”  implementation  of  the  homogenized  energy  model,  the  lookup  table  (LUT) 
implementations  yield  significantly  better  performance.  Algorithms  1-4  were  implemented  using  the  same  para¬ 
meters  and  iterations  as  the  direct  implementations  given  in  Table  1.  All  computations  were  performed  using  a 
Pentium  IV  Xeon  1.7  GHz  processor  running  Linux.  Algorithms  1-4  were  implemented  in  a  C  language  MATLAB 
mex  file  (available  from  http://research.coire.net).  The  run-times  for  the  input  shown  in  Figure  4  are  given  in 
Table  2.  For  each  of  these  tests,  the  lookup  table  size  was  limited  to  1000  elements  per  table  for  Algorithms  2 
and  3  and  80,000  elements  per  table  -  1000  for  Ee  x  a  times  80  values  for  Ec  -  for  Algorithm  4.  In  all  cases,  the 
implementation  algorithms  employed  the  parameters  given  in  [2].  When  reporting  results,  we  let  N ^  denote 
the  number  of  quadrature  points  used  when  approximating  the  average  polarizations  ( P_ ),  (P90),  (-P+)  in  (17) 

or  (21)  and  Np  denote  the  number  of  quadrature  points  employed  for  the  likelihoods  p _ p. | _ ,  p_ 90,  P90-,  P+90, 

P90+  in  (19)  or  (23) 

We  first  compare  the  run-times  for  the  lookup  table  approximations  with  the  direct  implementation.  Al¬ 
gorithm  1  ran  in  one-third  of  the  time  of  the  unoptimized  version  on  this  platform  whereas  Algorithm  2  ran 
almost  50  times  faster  than  a  direct  implementation.  With  stress-dependence,  Algorithm  3  ran  10-12  times  faster 
(depending  on  whether  the  setup  time  is  included)  and  Algorithm  4  ran  either  4000  or  13,800  times  faster,  again 
depending  on  whether  or  not  setup  time  is  included  in  the  comparison.  Thus,  the  lookup  table  approximations 


Algorithm 

Setup  time  (s) 

Run-time  (s) 

Neg.  relaxation,  no  stress 

0 

9.03 

Relaxation,  no  stress 

0.09 

50.36 

Neg.  relaxation,  stress  dep 

.  ( a  constant) 

0.09 

10.50 

Neg.  relaxation,  stress  dep 

.  (<r  varying) 

0.08 

10.91 

Relaxation,  stress  dep.  ( a 

constant) 

12.64 

56.40 

Relaxation,  stress  dep.  (ct 

varying) 

122.64 

57.12 

Table  2:  Setup  time  and  run-times  for  Algorithms  1-4,  computed  on  a  Pentium  IV  1.7  GHz  processor  run¬ 
ning  Linux,  with  the  algorithms  implemented  as  C  language  MATLAB  mex  files  and  120,000  time  steps  being 
computed. 
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Figure  5:  Comparisons  of  polarization  given  by  direct  model  implementation  and  algorithms  based  on  lookup 
tables  (LUT).  (a)  Stress-invariant,  relaxation  algorithm  (Algorithm  2).  (b)  Stress-dependent,  relaxation  algo¬ 
rithm  (Algorithm  4).  (c)  Stress-dependent  negligible  relaxation  algorithm  (Algorithm  3)  and  (d)  approximation 
error  with  and  without  interpolation. 


yield  anywhere  from  1  to  4  orders  of  magnitude  improvement  depending  on  which  formulation  is  employed. 

The  level  of  accuracy  afforded  by  the  lookup  tables  can  be  adjusted  without  altering  the  run-time,  although 
memory  usage  and  setup  time  increases  with  accuracy.  To  illustrate  this,  Algorithm  2  was  compared  with  a 
direct  implementation  of  the  stress-invariant  relaxation  model  using  parameters  taken  from  [2].  The  results  are 
summarized  in  Table  3  and  Figure  5(a).  The  number  of  quadrature  points  and  Np  were  adjusted  separately. 
The  table  suggests  that  memory  is  best  used  by  allowing  larger  tables  for  likelihoods  than  average  polarizations. 
We  note  that  computation  time  does  not  increase  as  the  table  size  increases.  Surprisingly,  the  table  does  not 
show  setup  time  increases  as  memory  increases.  This  is  somewhat  misleading  since  the  calculation  is  rapid,  and 
the  clock  on  the  computer  does  not  have  sufficient  resolution  to  resolve  the  differences  as  memory  increases. 

The  trends  for  Algorithm  3  are  similar  and  yield  the  errors  summarized  in  Table  4.  In  this  case,  we  include 
error  values  for  both  rounding  and  interpolating.  Setup  time  does  increase  as  the  amount  of  memory  increases. 
Since  the  lookup  tables  are  two  dimensional,  the  number  of  points  used  for  both  the  input  field  and  input  stress 
are  given.  In  addition  to  noting  the  consistent  run-times  for  all  table  sizes,  we  note  that  interpolation  did  not 
increase  the  run-time  of  the  algorithm.  This  is  surprising  in  the  sense  that  interpolation  requires  two  to  four  times 
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N(P) 

Np 

Setup  time  (s) 

Run-time  (s) 

RMS  error  (%) 

500 

500 

0.06 

54.12 

3.229 

500 

1000 

0.07 

53.67 

3.047 

500 

2000 

0.07 

55.37 

2.967 

1000 

500 

0.06 

53.43 

3.230 

1000 

1000 

0.06 

54.13 

3.048 

1000 

2000 

0.07 

55.78 

2.968 

2000 

500 

0.07 

53.80 

3.230 

2000 

1000 

0.07 

54.54 

3.049 

2000 

2000 

0.07 

55.87 

2.969 

Table  3:  Error  between  the  direct  model  implementation  and  Algorithm  2  for  regimes  with  thermal  relaxation 
but  negligable  stress-dependencies.  Errors  are  root  mean  square  (RMS)  values  expressed  as  percentages  of  the 
saturation  polarizations  (0.3  C/m2).  Run-times  are  for  120,000  time  steps. 


the  memory  access  and  adds  several  additional  additions  and  multiplies  to  each  kernel  evaluation.  Investigation 
showed  the  computation  had  a  bottleneck  with  the  storage  of  <5  values  to  memory  and  apparently  the  extra 
computation  required  for  interpolation  was  performed  when  the  processor  would  otherwise  have  been  waiting 
for  access.  This  is  both  an  architecture  and  a  compiler-dependent  feature,  and  the  same  performance  may  not 
be  seen  on  other  platforms.  However,  the  accuracy  gained  by  interpolation,  2  to  3  orders  of  magnitude,  is  not 
architecture  dependent.  It  will  vary  if  different  material  parameters  are  employed  but  the  qualitative  behavior 
should  remain  regardless  of  architecture.  As  such,  interpolation  provides  an  effective  way  to  limit  lookup  table 
size  while  improving  computation  performance.  These  results  are  also  plotted  in  Figure  5(c)  and  (d)  to  provide  a 
comparison  to  the  direct  implementation  and  illustrate  the  error  introduced  by  the  lookup  table  approximation. 

Given  the  accuracy  gained  by  interpolation,  it  is  reasonable  to  ask  why  it  is  not  performed  on  Algorithm  2  or  4. 
The  likelihoods  (19)  and  (23)  are  exponential;  linear  interpolation  as  employed  for  Algorithm  3  will  overpredict 
the  likelihoods,  yielding  moments  that  switch  too  soon.  Each  moment  that  switches  prematurely  adds  2 Pr  to  the 
approximation  error.  As  such,  linear  interpolation  does  not  yield  a  significant  increase  when  relaxation  effects  are 
included.  This  also  explains  the  slower  convergence  of  the  approximations  including  thermal  relaxation;  small 
errors  present  in  the  likelihood  calculation  dictate  small  errors  in  the  dipole  states  which  give  larger  errors  in  the 
output  polarization. 

Table  5  and  Figure  5(b)  illustrate  the  accuracy  of  Algorithm  4  for  regimes  which  include  both  stress- 
dependence  and  relaxation.  In  terms  of  accuracy  and  run-time,  the  results  are  comparable  to  Algorithms  2 
and  3.  However,  in  terms  of  setup  time,  this  algorithm  is  significantly  slower  and  uses  significantly  more  mem- 


Field  points 

Stress  points 

Setup  time  (s) 

Rounding 

Run-time  (s)  error  (%) 

Interpolating 

Run-time  (s)  error  (%) 

125 

125 

0.95 

10.36 

0.01258 

10.61 

9.26  x  10"b 

250 

125 

1.52 

10.45 

0.00922 

10.44 

2.53  x  10"5 

500 

125 

3.04 

10.34 

0.00907 

10.45 

8.71  x  10"6 

125 

250 

1.84 

10.42 

0.01000 

10.45 

9.03  x  10-5 

250 

250 

3.67 

10.38 

0.00514 

10.40 

2.30  x  10"5 

500 

250 

7.40 

10.49 

0.00487 

10.70 

6.26  x  10"6 

125 

500 

4.89 

10.60 

0.00922 

10.66 

8.97  x  10-5 

250 

500 

9.79 

10.50 

0.00342 

10.71 

2.24  x  10"5 

500 

500 

19.71 

10.56 

0.00298 

10.63 

5.67  x  10-6 

Table  4:  Comparison  of  the  direct  and  look-up  table  stress-dependent,  negligible  relaxation  Algorithm  3.  Errors 
are  root  mean  square  values  expressed  as  percentages  of  the  saturation  polarizations  (0.3  C/m2).  Run-times  are 
for  120,000  time  steps. 
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Field  points 

Stress  points 

Setup  time  (s) 

Run-time  (s) 

RMS  Error 

25 

25 

107.20 

0.28 

2.42363 

50 

25 

173.97 

0.28 

1.47033 

100 

25 

307.18 

0.29 

1.16744 

25 

50 

215.55 

0.28 

2.52963 

50 

50 

343.12 

0.28 

1.21354 

100 

50 

615.48 

0.29 

1.06211 

25 

100 

426.51 

0.28 

2.53719 

50 

100 

693.13 

0.28 

1.21641 

100 

100 

1225.09 

0.29 

1.06611 

Table  5:  Comparison  of  the  direct  and  look-up  table  for  the  stress-dependent,  relaxation  Algorithm  4,  with  a 
varying  input  stress  and  80  Ec  quadrature  points.  Errors  are  root  mean  square  values  expressed  as  percentages 
of  the  saturation  polarizations  (0.3  C/m2).  Run-times  are  for  600  time  steps. 


ory.  This  is  an  unavoidable  attribute  of  the  method.  Note  that  Table  5  shows  run-times  for  only  600  temporal 
iterations.  This  amount  was  chosen  to  keep  the  run-time  of  the  direct  implementation  algorithm  manageable. 
As  shown  in  Table  2,  Algorithm  4  could  complete  120,000  iterations  in  just  under  a  minute,  not  including  the 
setup  time  for  the  desired  table  size. 

5  Comparison  to  Experimental  Data 

In  Section  4,  we  demonstrate  that  through  the  use  of  lookup  tables,  the  model  formulations  which  incorporate 
thermal  relaxation  and  stress  effects  can  be  implemented  in  a  highly  efficient  manner  while  maintaining  the 
accuracy  of  the  original  model.  We  summarize  here  the  performance  of  the  framework,  as  implemented  using  the 
algorithms  developed  in  Section  3,  for  characterizing  the  magnetic  after-effects  and  stress-dependencies  illustrated 
in  Figure  1.  We  note  that  the  unified  nature  of  the  framework  permits  a  direct  extension  of  Algorithms  1-4  to 
ferromagnetic  regimes. 

In  Figure  6,  we  illustrate  the  capability  of  the  magnetic  model,  implemented  via  Algorithm  2,  to  characterize 
data  collected  from  a  cylindrical  steel  rod  having  a  length  of  2  inches  and  diameter  of  0.25  inches.  Specifically,  it 
is  noted  that  by  incorporating  relaxation  mechanisms,  the  model  characterizes  the  magnetic  after-effects  which 
yield  negative  differential  susceptibilities  following  field  reversal  at  the  beginning  of  minor  loops.  Additional 
details  regarding  both  the  experimental  setup  and  model  performance  are  provided  in  [3] . 

The  performance  of  the  stress-dependent  polarization  model,  implemented  via  Algorithm  3,  is  illustrated  in 
Figure  7  through  comparison  with  PLZT  data  from  [9].  Parameters  were  estimated  through  a  least-squares  fit 
to  data  collected  at  prestress  levels  of  no  =  0  MPa,  a0  =  —6  MPa,  cto  =  —10  MPa  and  =  —15  MPa  to 
obtain  the  present  model  fits.  The  fits  in  (b)  and  (d)  illustrate  minor  loop  attributes  of  the  data  and  model 
near  negative  remanence.  The  change  in  curvature  exhibited  by  the  data  with  <t0  =  — 15  MPa  illustrates  that 
the  manifestation  of  90°  switching  becomes  increasingly  prominent  as  prestress  levels  increase.  This  illustrates 
the  necessity  of  incorporating  90°  switching  mechanisms  in  both  the  model  and  implementation  algorithms. 
Additional  details  regarding  the  experimental  setup  can  be  found  in  [9]  whereas  the  model  extensions  required 
to  address  90°  switching  are  presented  in  [1,  2]. 

6  Concluding  Remarks 

The  homogenized  energy  model  developed  in  [11,  16,  18,  20]  provides  a  unified  framework  for  characterizing 
hysteresis  and  constitutive  nonlinearities  in  ferroelectric,  ferromagnetic  and  ferroelastic  materials.  Due  to  the 
energy  formulation  of  the  framework,  the  effects  of  thermally  activated  relaxation  phenomenon,  stress-dependence 
and  thermal  dependence  can  be  incorporated  in  a  natural  manner.  However,  direct  implementation  algorithms  for 
models  that  incorporate  these  mechanisms  presently  do  not  provide  the  efficiency  required  for  high-speed  material 
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Time  (s) 


(a) 


(b) 


Figure  6:  (a)  Input  magnetic  field,  (b)  fit  ( — )  provided 
and  (c)  detailed  view  of  fit. 


by  Algorithm  2  to  magnetization  data  ( - )  from  [3], 


characterization,  actuator  or  sensor  optimization,  or  real-time  control  implementation.  In  this  paper,  we  present 
highly  efficient  implementation  algorithms,  based  on  lookup  tables,  which  reduce  implementation  times  by  one 
to  four  orders  of  magnitude  for  models  which  incorporate  thermal  activation  and  stress-dependence. 

In  addition  to  providing  significant  improvements  in  efficiency,  the  use  of  lookup  table  approximations  hold 
a  further  advantage  for  those  using  embedded  devices  -  either  microprocessor  or  gate-array  based  -  in  which 
integer  arithmetic  is  required.  The  calculation  of  minimum  energies,  average  polarizations,  and  likelihoods  may 
involve  computation  on  very  large  or  very  small  numbers.  By  moving  these  calculations  offline,  they  may  be 
performed  within  a  processor  capable  of  floating  point  operations  and  returned  to  integer  or  fixed-point  format 
for  storage  and  real-time  processing. 

As  detailed  in  [8,  14],  a  highly  advantageous  attribute  of  the  homogenized  energy  framework  is  the  property 
that  approximate  model  inverses  can  be  constructed  with  nearly  the  same  efficiency  as  the  direct  models.  This 
permits  the  design  of  inverse  filters  which  approximately  linearize  hysteretic  actuator  or  sensor  dynamics  and 
hence  permit  linear  control  design.  The  development  and  experimental  implementation  of  inverse  filters  based 
on  the  algorithms  presented  here  is  under  investigation. 
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Figure  7:  Comparison  of  modeled  E  —  P  implemented  in  Algorithm  3  with  PLZT  data  from  [9]  for  compressive 
prestresses  of  (a)  Co  =  0  MPa,  (b)  cr0  =  —6  MPa,  (c)  do  =  —10  MPa  and  (d)  ctq  =  —15  MPa. 
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